REMARKS/ARGUMENTS 



Claims 1-77 are pending in the application. Claims 27, 35, 37-38, 46, 49, 52, 55, 62-64, 71, and 
73-77 are amended herein. The Applicant hereby requests further examination and reconsideration of the 
application in view of the foregoing amendments and these remarks. 

Unacknowledged Reference 

Accompanying the final office action was a copy of the Form PTO-1449 identifying References 
JA-JE cited by the Applicant in a previous IDS. The Examiner acknowledged consideration of 
References JA-JD by initialing those citations. For some reason, Reference JE was not initialed. The 
Applicant requests that the Examiner send another copy of that Form PTO-1449 with Reference JE 
initialed to acknowledge consideration of that reference. 

Claim Rejections 

In paragraph 2 of the final office action, the Examiner rejected claims 1-10, 12-23, and 25-77 
under 35 U.S.C. 103(a) as being unpatentable over Ten Kate in view of Shaffer, and further in view of 
Moon. In paragraph 3, the Examiner rejected claims 11 and 24 under 35 U.S.C. 103(a) as being 
unpatentable over Ten Kate in view of Shaffer, further in view of Moon, and further in view of 
Jafarkhani. For the following reasons, the Applicant submits that all of the now-pending claims are 
allowable over the cited references. 

Telephonic Interview 

On 02/01/07, the Examiner participated in a telephonic interview with the Applicant's attorney 
Steve Mendelsohn. The Applicant thanks the Examiner for the courtesy of that interview. 

During the interview, the Examiner stated that amending claim 27 to recite that "an audio 
decoder is enabled to generate more than E different playback audio channels based on only the E 
transmitted channels and the one or more cue codes" would distinguish over Ten Kate and therefore 
would overcome the pending rejection of claim 27 under 35 U.S.C. 103(a). 

In the event that the Examiner believes that this amendment does not place the application in 
condition for allowance, the Applicant requests a further telephonic interview between the Examiner and 
the Applicant's attorney Steve Mendelsohn. The Applicant requests that the Examiner call Mr. 
Mendelsohn (215-557-6657) to arrange a convenient time for such an interview. 

Claims 27, 37, 38, 49, and 52 

Claim 27 has been amended to clarify that an audio decoder is enabled to generate more than E 
different playback audio channels based on only the E transmitted channels and the one or more cue 
codes. The purpose of this amendment is to distinguish over the teachings in Ten Kate. 

In particular, Ten Kate teaches an audio system in which the number of playback audio channels 
is equal to the number of transmitted channels. The purpose of this amendment to claim 27 is to 
foreclose interpretation of a subset of Ten Kate's transmitted channels as being an example of the E 
transmitted channels of claim 27. 

For example, Ten Kate teaches a 3-channel decoder that receives three transmitted audio 
channels (i.e., the composite signal, the one selected combination signal, and the one selected original 
signal) and generates three playback audio channels. The current amendment to claim 27 forecloses 
interpretation of one or two of those three transmitted audio channels as being an example of the E 
transmitted channels of claim 27 by explicitly reciting, in claim 27, that the audio decoder is enabled to 
generate more than E different playback audio channels based on only the E transmitted channels and the 
one or more cue codes. 

Claim 27 has been further amended to clarify that one or more cue codes are generated for each 
of two or more different frequency bands in the two or more input channels in the frequency domain. 
This amendment further distinguishes the claimed invention over the cited references because none of 
those references teaches or even suggests the generation of cue codes where one or more cue codes are 
generated for each of two or more different frequency bands. 

In view of the foregoing, the Applicant submits that currently amended claim 27 is allowable 
over the cited references. For similar reasons, the Applicant submits that currently amended claims 37, 
38, 49, and 52 are allowable over the cited references. Since the rest of the claims depend variously from 
claims 27, 37, 38, 49, and 52, it is further submitted that those claims are also allowable over the cited 
references. 

Claims 35 and 46 

According to currently amended claim 35, the downmixing comprises, for each of two or more 
different frequency bands, downmixing the two or more input channels in the frequency domain into one 
or more downmixed channels in the frequency domain. In rejecting claim 35, the Examiner suggested 
that Shaffer teaches downmixing in the frequency domain, stating that "the decoder can apply the balance 
parameter to all received frequencies" and citing column 5, lines 56-67. 

First of all, whether or not Shaffer's decoder can apply the balance parameter to all received 
frequencies is completely irrelevant to whether or not the downmixing is performed in the frequency 
domain. In Shaffer (and in the present invention), downmixing is performed at the encoder, not at the 
decoder. See, e.g., adder 44 in Shaffer's Fig. 3. Significantly, the only downmixing taught in Shaffer is 
performed in the time domain, not in the frequency domain. See, e.g., column 6, lines 18-26. 

There is no teaching or even suggestion in Shaffer for performing downmixing in the frequency 
domain. As such, the Applicant submits that this provides additional reasons for the allowability of 
claim 35 and also currently amended claim 46 over the cited references. 

Claims 36 and 47 

According to claim 36, the downmixing further comprises converting the one or more 
downmixed channels from the frequency domain into one or more of the transmitted channels in the time 
domain. Thus, according to claim 36, which depends from claim 35, after the downmixed channels are 
generated in the frequency domain, they are converted into transmitted channels in the time domain. In 
rejecting claim 36, the Examiner suggested that "time shifting" is related to converting downmixed 
channels in the frequency domain into transmitted channels in the time domain, citing column 6, lines 18- 
26, of Shaffer. 

First of all, the time shifting mentioned in column 6, lines 23-24, refers to a technique for 
generating a single sample stream from left and right sample streams in which one of the input sample 
streams is shifted in time relative to the other input sample stream to compensate for a characterized time 
delay between the two input streams before the input streams are combined to generate the single sample 
stream. This processing is implemented entirely in the time domain and has absolutely nothing to do 
with converting audio channels from a frequency domain into a time domain. 

Moreover, this time shifting is performed before the two input streams are combined to generate 
the single (downmixed) sample stream. As such, this time shifting is performed before the downmixed 
sample stream is even generated. 

In fact, there is no teaching or even suggestion in Shaffer for converting downmixed channels 
from a frequency domain into a time domain. The Applicant submits that this provides additional 
reasons for the allowability of claims 36 and 47 over the cited references. 

Claims 3 and 16 

According to claim 3, each set of one or more auditory scene parameters corresponds to a 
different audio source in the auditory scene. In rejecting claim 3, the Examiner cited Shaffer's interaural 
level differences (ILD) and interaural time delays (ITD) and column 4, lines 38-44. While it is true that 
Shaffer's ILD and ITD are two different auditory scene parameters, the number of different auditory 
scene parameters has nothing to do with the number of audio sources in an auditory scene for which the 
different auditory scene parameters are derived. An audio source in an auditory scene refers to the 
location in physical space of the origin of sound arriving at the microphones that are used to generate the 
auditory scene parameters. 

According to claim 2, from which claim 3 depends, each set of auditory scene parameters 
corresponds to a different frequency band in the combined audio signal. Claim 3 adds the limitation that 
each set of auditory scene parameters corresponds to a different audio source in the auditory scene. 
Thus, if the sound arriving at the microphones in an auditory scene originates at multiple locations (e.g., 
when someone is speaking in a conference room having an air conditioner running and a car passing by 
outside), according to claim 3, each set of auditory scene parameters corresponds to a different frequency 
band (e.g., one set of auditory scene parameters in a first frequency band corresponds to the speaker, 
another set in a second frequency band corresponds to the air conditioner, and yet another set in a third 
frequency band corresponds to the car). 

Clearly, Shaffer does not teach or even suggest the features recited in claim 3. The Applicant 
submits that this provides additional reasons for the allowability of claim 3 over the cited references. 
Similarly, the Applicant submits that this provides additional reasons for the allowability of claim 16. 

Claims 4 and 17 

According to claim 4, for at least one of the sets of one or more auditory scene parameters, at 
least one of the auditory scene parameters corresponds to a combination of two or more different audio 
sources in the auditory scene that takes into account relative dominance of the two or more different 
audio sources in the auditory scene. In rejecting claim 4, the Examiner cited Shaffer's use of 
cross-correlation, Fig. 7, and column 8, lines 43-57. The Applicant submits that Shaffer's use of 
cross-correlation has nothing to do with the features recited in claim 4. 

According to claim 4, at least one auditory scene parameter corresponds to sound coming from 
two or more different locations in the auditory scene, where the relative dominance (e.g., which location 
is louder than the other(s)) of the audio sources is taken into account. Shaffer uses cross-correlation 
between two different audio channels to determine the relative time delay between the two channels. 
This has nothing to do with the relative dominance of the two audio channels. Moreover, Shaffer's two 
audio channels are left and right stereo channels, which is independent of whether the sound in those 
stereo audio channels comes from one audio source or multiple audio sources. 

Clearly, Shaffer does not teach or even suggest the features recited in claim 4. The Applicant 
submits that this provides additional reasons for the allowability of claim 4 over the cited references. 
Similarly, the Applicant submits that this provides additional reasons for the allowability of claim 17. 

Claims 7 and 20 

According to claim 7, the combined audio signal corresponds to a combination of two or more 
different mono source signals, wherein the two or more different frequency bands are selected by 
comparing magnitudes of the two or more different mono source signals, wherein, for each of the two or 
more different frequency bands, one of the mono source signals dominates the one or more other mono 
source signals. In rejecting claim 7, the Examiner cited both Ten Kate and Shaffer. 

In particular, the Examiner suggested that Ten Kate teaches a combined audio signal 
corresponding to a combination of two or more different mono source signals, citing column 2, lines 33- 
56. Ten Kate teaches a combined audio signal corresponding to a combination of three audio signals, but 
they are not three different mono source signals. Three different mono source signals refers to three 
different mono audio signals, each coming from a different location in an auditory scene. In Ten Kate, 
the three audio signals are the left (L), right (R), and center (C) channels of a 3-channel audio system, 
where the sound in those three different channels all comes from the same audio source(s) in an auditory 
scene. Thus, Ten Kate does not teach or even suggest a combined audio signal corresponding to a 
combination of two or more different mono source signals. 

The Examiner also suggested that Shaffer teaches that "for each of the two or more different 
frequency bands, one of the mono source signals dominates the one or more other mono source signals," 
again citing Shaffer's use of cross-correlation. For at least some of the same reasons given in the 
previous section, the Applicant submits that Shaffer's use of cross-correlation has nothing to do with 
relative dominance of audio sources. Moreover, Shaffer uses cross-correlation to determine relative time 
delays between audio channels. There is no teaching or even suggestion in Shaffer for using cross- 
correlation to select different frequency bands. 

The Applicant submits that this provides additional reasons for the allowability of claim 7 over 
the cited references. Similarly, the Applicant submits that this provides additional reasons for the 
allowability of claim 20. 

Claims 8 and 21 

According to claim 8, the combined audio signal corresponds to a combination of left and right 
audio signals of a binaural signal, wherein each different set of one or more auditory scene parameters is 
generated by comparing the left and right audio signals in a corresponding frequency band. In rejecting 
claim 8, the Examiner suggested that Shaffer teaches generating auditory scene parameters "by 
comparing the left and right audio signals in ... corresponding frequency bands," citing Shaffer's 
"subband coding" and column 3, lines 43-47. 

First of all, Shaffer teaches the generation of ITD parameters using only time-domain techniques. 
There is no teaching in Shaffer for generating ITD parameters in the frequency domain, let alone 
generating different ITD parameters in different frequency bands. 

Moreover, the subband coding described in Shaffer is performed by signal encoder 46 of Fig. 4 
on the single sample stream generated by adder 44. Since, at this point, there is only one audio channel, 
there simply cannot be any comparison between left and right audio signals in Shaffer's subband coding. 

The Applicant submits that this provides additional reasons for the allowability of claim 8 over 
the cited references. Similarly, the Applicant submits that this provides additional reasons for the 
allowability of claim 21. 

Claims 10 and 23 

According to claim 10, step (b) comprises the step of applying a layered coding technique in 
which stronger error protection is provided to the combined audio signal than to the auditory scene 
parameters when generating the embedded audio signal, such that errors due to transmission over a lossy 
channel will tend to affect the auditory scene parameters before affecting the combined audio signal to 
improve the probability of the first receiver to process at least the combined audio signal. 

In rejecting claim 10, the Examiner suggested that Shaffer teaches such a layered coding 
technique, citing column 3, lines 63-67; column 1, lines 21-24; voice packets 50 in Fig. 4; encoder 24 and 
decoder 30 of Fig. 1; Fig. 7; and column 8, lines 43-50. The Applicant submits that the Examiner 
mischaracterized the teachings in Shaffer in rejecting claim 10. 

Column 3, lines 63-67, describes packets being encapsulated with "lower layer headers." 
Column 1, lines 21-24, suggests that packet headers can contain "error correction information." Fig. 4 
indicates the generation of "voice packets 50." These teachings suggest, at most, that, by using voice 
packets with headers that contain error correction information, Shaffer's encoder 24 and decoder 30 work 
together to recover from some errors due to transmission over a lossy channel. 

Significantly, however, there is no teaching or suggestion in Shaffer of a layered coding 
technique in which stronger error protection is provided to the combined audio signal than to the auditory 
scene parameters when generating the embedded audio signal, such that errors due to transmission over a 
lossy channel will tend to affect the auditory scene parameters before affecting the combined audio signal 
to improve the probability of the first receiver to process at least the combined audio signal. The 
Examiner's bald assertion that such features are obvious in light of Shaffer's teachings is completely 
unsupported by any teachings in Shaffer and therefore improper. 

Note that Fig. 7 and column 8, lines 43-50, relate to cross-correlation processing used to 
determine the time delay between the left and right audio channels. These teachings have absolutely 
nothing to do with error protection. 

The Applicant submits that this provides additional reasons for the allowability of claim 10 over 
the cited references. Similarly, the Applicant submits that this provides additional reasons for the 
allowability of claim 23. 

Claims 55 and 63 



According to currently amended claim 55, for each of two or more different frequency bands, 
one or more of the E transmitted channels are upmixed in a frequency domain to generate two or more of 
M playback channels in the frequency domain, where M>E≥1. The one or more cue codes are applied to 
each of the two or more different frequency bands in the two or more playback channels in the frequency 
domain to generate two or more modified channels, and the two or more modified channels are converted 
from the frequency domain into a time domain. 

In rejecting claim 55, the Examiner appears to have combined different passages in Ten Kate, 
Shaffer, and Moon based on the appearance of individual words that are also recited in claim 55, with 
little or no regard for the teachings in those passages or the lack of motivation for such combinations. 

For example, the Examiner cited Moon, column 7, lines 42-57, as being related to the upmixing 
of claim 55. According to claim 55, upmixing is applied to one or more transmitted channels to generate 
two or more playback channels. In this context, upmixing is related to a technique for increasing the 
number of channels by variously duplicating and/or combining the input channels. Moon teaches a 
completely different type of upmixing. The upmixing taught in Moon relates to the conversion of a 
single input channel in one frequency into a single output channel of a higher frequency. In this type of 
upmixing, the number of channels does not change; only the channel frequency changes. 

Similarly, the Examiner cited the "subband coding" mentioned in Ten Kate, column 6, lines 47- 
59, as being related to the fact that the upmixing of claim 55 is performed in the frequency domain. Like 
the subband coding taught in Shaffer described previously, the "subband coding" taught in Ten Kate 
refers to the conventional data compression technique applied to a single sample stream. It has nothing 
to do with upmixing one or more input channels to generate a greater number of output channels. 

Here, too, the Examiner cited Ten Kate, column 2, lines 28-38, as teaching the generation of 
more playback channels than the number of transmitted channels. As described previously with regard to 
claim 27, Ten Kate does not teach or even suggest the generation of more playback channels than the 
number of transmitted channels. 

The Examiner cited Shaffer, column 1, line 60, to column 2, line 7, and Figs. 4 and 8 as being 
related to the application of cue codes to different frequency bands in the playback channels in the 
frequency domain to generate two or more modified channels, recited in claim 55. The passage cited by 
the Examiner teaches the application of a directional cue, but there is no teaching or suggestion that this 
processing is performed on different frequency bands in the frequency domain. Fig. 4 shows a packet 
format, and Fig. 8 shows Shaffer's decoder. Neither of these figures shows or suggests any frequency- 
domain processing. 

Regarding the conversion of the modified channels from the frequency domain into a time 
domain, the Examiner cited Shaffer, column 4, lines 11-22. The teachings in this passage relate to the 
handling of transmitted data packets and have absolutely nothing to do with the conversion of channels 
from a frequency domain into a time domain. 

Even if the Examiner were correct about the individual teachings in the various references 
(which the Applicant explicitly and emphatically denies), the fact remains that there is no motivation for 
such a combination of references. An Examiner is not free to haphazardly combine teachings from 
disparate references to reject a claimed invention. There has to be a legitimate motivation for such a 
combination. 

The Applicant submits that this provides additional reasons for the allowability of claim 55 and 
similarly of currently amended claim 63 over the cited references. 

Claims 62 and 71 

According to currently amended claim 62, the upmixing comprises, for each of two or more 
different frequency bands, upmixing at least two of the E transmitted channels into at least one playback 
channel in the frequency domain. In rejecting claim 62, the Examiner cited Shaffer, column 5, lines 56- 
67, as teaching "the upmixing comprises, for each of one or more different frequency bands, downmixing 
the two or more input channels in the frequency domain into one or more downmixed channels in the 
frequency domain" (emphasis added), stating further that "the decoder can apply the balance parameter to 
all received frequencies." 

First of all, the recitations of claim 62 relate to upmixing at the decoder. The Examiner 
mischaracterized the recitations of claim 62 as "the upmixing comprises ... downmixing." 

Furthermore, as described previously with regard to claim 35, whether or not Shaffer's decoder 
can apply the balance parameter to all received frequencies has nothing to do with whether or not 
upmixing is performed in the frequency domain. There is simply no teaching in Shaffer that any 
upmixing is performed in the frequency domain. 

Claims 73-77 

According to currently amended claim 73, the method comprises generating, in the frequency 
domain, ICTD data as one of the one or more cue codes for at least two of the two or more different 
frequency bands, wherein each of the at least two different frequency bands has different ICTD data. In 
rejecting previously presented claim 73, the Examiner stated that Shaffer teaches "generating, in the 
frequency domain, ICTD data as one of the one or more cue codes," citing column 4, lines 11-67. 

In column 4, lines 11-67, Shaffer teaches that "the primary left-right directional cue is ITD 
(interaural time delay) for mid-low- to mid-frequencies," while, "for higher frequencies, the primary left- 
right directional cue is ILD (interaural level differences)." At most, these teachings in Shaffer suggest 
using ITD as a directional cue below a specified cut-off frequency and using ILD as a directional cue 
above that cut-off frequency. This is different from teaching the generation of different ICTD data for 
each of two or more different frequency bands. 

As such, the Applicant submits that this provides additional reasons for the allowability of claim 
73 and similarly of currently amended claims 74-77 over the cited references. 

In view of the foregoing, the Applicant respectfully submits that the rejections of claims under 
Section 103(a) have been overcome. 

In view of the above amendments and remarks, the Applicant believes that the now-pending 
claims are in condition for allowance. Therefore, the Applicant believes that the entire application is 
now in condition for allowance, and early and favorable action is respectfully solicited. 

Respectfully submitted, 

Customer No. 46900 Steve Mendelsohn 

Mendelsohn & Associates, P.C. Registration No. 35,951 

1500 John F. Kennedy Blvd., Suite 405 Attorney for Applicant 

Philadelphia, Pennsylvania 19102 (215) 557-6657 (phone) 

(215) 557-8477 (fax) 